NSF PAR Search | NSF Public Access Repository

Morphometrics and Phylogenomics of Coca (Erythroxylum spp.) Illuminate Its Reticulate Evolution, With Implications for Taxonomy

https://doi.org/10.1093/molbev/msae114

Przelomska, Natalia A S; Diaz, Rudy A; Ávila, Fabio Andrés; Ballen, Gustavo A; Cortés-B, Rocío; Kistler, Logan; Chitwood, Daniel H; Charitonidou, Martha; Renner, Susanne S; Pérez-Escobar, Oscar A; et al. (July 2024, Molecular Biology and Evolution)
Ouangraoua, Aida (Ed.)
Abstract South American coca (Erythroxylum coca and E. novogranatense) has been a keystone crop for many Andean and Amazonian communities for at least 8,000 years. However, over the last half-century, global demand for its alkaloid cocaine has driven intensive agriculture of this plant and placed it in the center of armed conflict and deforestation. To monitor the changing landscape of coca plantations, the United Nations Office on Drugs and Crime collects annual data on their areas of cultivation. However, attempts to delineate areas in which different varieties are grown have failed due to limitations around identification. In the absence of flowers, identification relies on leaf morphology, yet the extent to which this is reflected in taxonomy is uncertain. Here, we analyze the consistency of the current naming system of coca and its four closest wild relatives (the “coca clade”), using morphometrics, phylogenomics, molecular clocks, and population genomics. We include name-bearing type specimens of coca's closest wild relatives E. gracilipes and E. cataractarum. Morphometrics of 342 digitized herbarium specimens show that leaf shape and size fail to reliably discriminate between species and varieties. However, the statistical analyses illuminate that rounder and more obovate leaves of certain varieties could be associated with the subtle domestication syndrome of coca. Our phylogenomic data indicate extensive gene flow involving E. gracilipes which, combined with morphometrics, supports E. gracilipes being retained as a single species. Establishing a robust evolutionary-taxonomic framework for the coca clade will facilitate the development of cost-effective genotyping methods to support reliable identification.
Full Text Available
Novel symmetry-preserving neural network model for phylogenetic inference

https://doi.org/10.1093/bioadv/vbae022

Tang, Xudong; Zepeda-Nuñez, Leonardo; Yang, Shengwen; Zhao, Zelin; Solís-Lemus, Claudia (February 2024, Bioinformatics Advances)
Ouangraoua, Aida (Ed.)
Abstract Scientists worldwide are putting together massive efforts to understand how the biodiversity that we see on Earth evolved from single-cell organisms at the origin of life, and this diversification process is represented through the Tree of Life. Low sampling rates and high heterogeneity in the rate of evolution across sites and lineages produce a phenomenon denoted “long branch attraction” (LBA) in which long non-sister lineages are estimated to be sisters regardless of their true evolutionary relationship. LBA has been a pervasive problem in phylogenetic inference affecting different types of methodologies from distance-based to likelihood-based. Here, we present a novel neural network model that outperforms standard phylogenetic methods and other neural network implementations under LBA settings. Furthermore, unlike existing neural network models in phylogenetics, our model naturally accounts for the tree isomorphisms via permutation-invariant functions, which ultimately result in lower memory usage and allow seamless extension to larger trees.
Analysis of Fungal Genomes Reveals Commonalities of Intron Gain or Loss and Functions in Intron-Poor Species

https://doi.org/10.1093/molbev/msab094

Lim, Chun Shen; Weinstein, Brooke N; Roy, Scott W; Brown, Chris M (March 2021, Molecular Biology and Evolution)
Ouangraoua, Aida (Ed.)
Abstract Previous evolutionary reconstructions have concluded that early eukaryotic ancestors including both the last common ancestor of eukaryotes and of all fungi had intron-rich genomes. By contrast, some extant eukaryotes have few introns, underscoring the complex histories of intron–exon structures, and raising the question as to why these few introns are retained. Here, we have used recently available fungal genomes to address a variety of questions related to intron evolution. Evolutionary reconstruction of intron presence and absence using 263 diverse fungal species supports the idea that massive intron reduction through intron loss has occurred in multiple clades. The intron densities estimated in various fungal ancestors differ from zero to 7.6 introns per 1 kb of protein-coding sequence. Massive intron loss has occurred not only in microsporidian parasites and saccharomycetous yeasts, but also in diverse smuts and allies. To investigate the roles of the remaining introns in highly-reduced species, we have searched for their special characteristics in eight intron-poor fungi. Notably, the introns of ribosome-associated genes RPL7 and NOG2 have conserved positions; both intron-containing genes encoding snoRNAs. Furthermore, both the proteins and snoRNAs are involved in ribosome biogenesis, suggesting that the expression of the protein-coding genes and noncoding snoRNAs may be functionally coordinated. Indeed, these introns are also conserved in three-quarters of fungi species. Our study shows that fungal introns have a complex evolutionary history and underappreciated roles in gene expression.
Full Text Available
CoreCruncher: Fast and Robust Construction of Core Genomes in Large Prokaryotic Data Sets

https://doi.org/10.1093/molbev/msaa224

Harris, Connor D; Torrance, Ellis L; Raymann, Kasie; Bobay, Louis-Marie (September 2020, Molecular Biology and Evolution)
Ouangraoua, Aida (Ed.)
Abstract The core genome represents the set of genes shared by all, or nearly all, strains of a given population or species of prokaryotes. Inferring the core genome is integral to many genomic analyses, however, most methods rely on the comparison of all the pairs of genomes; a step that is becoming increasingly difficult given the massive accumulation of genomic data. Here, we present CoreCruncher; a program that robustly and rapidly constructs core genomes across hundreds or thousands of genomes. CoreCruncher does not compute all pairwise genome comparisons and uses a heuristic based on the distributions of identity scores to classify sequences as orthologs or paralogs/xenologs. Although it is much faster than current methods, our results indicate that our approach is more conservative than other tools and less sensitive to the presence of paralogs and xenologs. CoreCruncher is freely available from: https://github.com/lbobay/CoreCruncher. CoreCruncher is written in Python 3.7 and can also run on Python 2.7 without modification. It requires the python library Numpy and either Usearch or Blast. Certain options require the programs muscle or mafft.
Full Text Available
